Feature-Based Segmentation Of Narrative Documents
نویسندگان
چکیده
In this paper we examine topic segmentation of narrative documents, which are characterized by long passages of text with few headings. We first present results suggesting that previous topic segmentation approaches are not appropriate for narrative text. We then present a featurebased method that combines features from diverse sources as well as learned features. Applied to narrative books and encyclopedia articles, our method shows results that are significantly better than previous segmentation approaches. An analysis of individual features is also provided and the benefit of generalization using outside resources is shown.
منابع مشابه
Document Analysis And Classification Based On Passing Window
In this paper we present Document analysis and classification system to segment and classify contents of Arabic document images. This system includes preprocessing, document segmentation, feature extraction and document classification. A document image is enhanced in the preprocessing by removing noise, binarization, and detecting and correcting image skew. In document segmentation, an algorith...
متن کاملAutomatic Prostate Cancer Segmentation Using Kinetic Analysis in Dynamic Contrast-Enhanced MRI
Background: Dynamic contrast enhanced magnetic resonance imaging (DCE-MRI) provides functional information on the microcirculation in tissues by analyzing the enhancement kinetics which can be used as biomarkers for prostate lesions detection and characterization.Objective: The purpose of this study is to investigate spatiotemporal patterns of tumors by extracting semi-quantitative as well as w...
متن کاملSegmentation of text/image documents using texture approaches
The digital computer and computer networks have made it possible to search for and retrieve electronically stored documents in seconds, no matter where in the world they are stored. This is far from the reality for documents stored as paper copies. Therefore there is considerable interest in digitizing paper documents. To digitize existing paper documents, it is of great importance to be able t...
متن کاملPerformance Analysis of Segmentation of Hyperspectral Images Based on Color Image Segmentation
Image segmentation is a fundamental approach in the field of image processing and based on user’s application .This paper propose an original and simple segmentation strategy based on the EM approach that resolves many informatics problems about hyperspectral images which are observed by airborne sensors. In a first step, to simplify the input color textured image into a color image without tex...
متن کاملPoorly Structured Handwritten Documents Segmentation using Continuous Probabilistic Feature Grammars
This work deals with poorly structured handwritten documents segmentation such as pages of handwritten notes produced with pen-based interfaces. We propose to use a formalism, based on Probabilistic Feature Grammars, that exhibit some interesting features. It allows handling ambiguities and to taking into account contextual information such as spatial relations between objects in the page.
متن کامل